AITopics | optimal interpolation

On Optimal Interpolation in Linear Regression

Neural Information Processing Systems, Dec-25-2025, 06:12:01 GMT

Understanding when and why interpolating methods generalize well has recently been a topic of interest in statistical learning theory. However, systematically connecting interpolating methods to achievable notions of optimality has only received partial attention. In this paper, we ask what the optimal way to interpolate in linear regression is, using functions that are linear in the response variable (as is the case for the Bayes optimal estimator in ridge regression) and depend on the data, the population covariance of the data, the signal-to-noise ratio and the covariance of the prior for the signal, but do not depend on the value of the signal itself nor the noise vector in the training data. We provide a closed-form expression for the interpolator that achieves this notion of optimality and show that it can be derived as the limit of preconditioned gradient descent with a specific initialization. We identify a regime where the minimum-norm interpolator provably generalizes arbitrarily worse than the optimal response-linear achievable interpolator that we introduce, and validate with numerical experiments that the notion of optimality we consider can be achieved by interpolating methods that only use the training data as input in the case of an isotropic prior. Finally, we extend the notion of optimal response-linear interpolation to random features regression under a linear data-generating model.
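To make the two objects named in the abstract concrete, here is a minimal NumPy sketch of the minimum-norm interpolator and of the limit of preconditioned gradient descent on the least-squares loss. The abstract does not state which preconditioner or initialization the paper uses, so `P` and `w0` below are arbitrary placeholders, and the function names are illustrative, not the authors' code. The sketch relies only on a standard fact: for a symmetric positive definite `P`, the iterates stay in `w0 + range(P X^T)` and converge to the interpolator `w0 + P X^T (X P X^T)^{-1} (y - X w0)`.

```python
import numpy as np

def min_norm_interpolator(X, y):
    """Minimum Euclidean-norm solution of X w = y (overparameterized case, n < d)."""
    return np.linalg.pinv(X) @ y

def preconditioned_limit(X, y, P, w0):
    """Closed-form limit of preconditioned gradient descent started at w0.

    P is an assumed symmetric positive definite preconditioner (placeholder).
    """
    residual = y - X @ w0
    return w0 + P @ X.T @ np.linalg.solve(X @ P @ X.T, residual)

def preconditioned_gd(X, y, P, w0, lr, steps):
    """Iterate w <- w - lr * P X^T (X w - y) on the loss 0.5 * ||X w - y||^2."""
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * (P @ (X.T @ (X @ w - y)))
    return w

# Small synthetic problem with more features than samples.
rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
P = np.diag(rng.uniform(0.5, 2.0, size=d))  # placeholder preconditioner
w0 = rng.standard_normal(d)                 # placeholder initialization

w_closed = preconditioned_limit(X, y, P, w0)
# Step size below 2 / lambda_max(X P X^T) guarantees convergence.
lr = 1.0 / np.linalg.eigvalsh(X @ P @ X.T).max()
w_gd = preconditioned_gd(X, y, P, w0, lr, steps=5000)
```

With `P` set to the identity and `w0` set to zero, `preconditioned_limit` reduces to the minimum-norm interpolator, which is why different choices of preconditioner and initialization select different interpolators of the same training data.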

linear regression, name change, optimal interpolation, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

RichSpace: Enriching Text-to-Video Prompt Space via Text Embedding Interpolation

Cao, Yuefan, Gong, Chengyue, Li, Xiaoyu, Liang, Yingyu, Sha, Zhizhou, Shi, Zhenmei, Song, Zhao

arXiv.org Artificial Intelligence, Feb-2-2025

Text-to-video generation models have made impressive progress, but they still struggle with generating videos with complex features. This limitation often arises from the inability of the text encoder to produce accurate embeddings, which hinders the video generation model. In this work, we propose a novel approach to overcome this challenge by selecting the optimal text embedding through interpolation in the embedding space. We demonstrate that this method enables the video generation model to produce the desired videos. Additionally, we introduce a simple algorithm using perpendicular foot embeddings and cosine similarity to identify the optimal interpolation embedding. Our findings highlight the importance of accurate text embeddings and offer a pathway for improving text-to-video generation performance.
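The abstract names the ingredients (interpolation between text embeddings, a perpendicular foot, cosine similarity) but not the exact procedure, so the following is only one plausible reading, sketched under stated assumptions: embeddings are treated as flat vectors, `a` and `b` are two prompt embeddings, `t` is a hypothetical target embedding, and the helper names are invented for illustration. The foot of the perpendicular from `t` onto the line through `a` and `b` is computed, and the interpolation candidate most cosine-similar to that foot is selected.

```python
import numpy as np

def perpendicular_foot(a, b, t):
    """Orthogonal projection of t onto the line through a and b.

    Returns the foot point and its line coordinate lam (foot = a + lam * (b - a)).
    """
    direction = b - a
    lam = np.dot(t - a, direction) / np.dot(direction, direction)
    return a + lam * direction, lam

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def best_interpolation(a, b, t, num_steps=101):
    """Pick the candidate (1 - lam) * a + lam * b, lam in [0, 1],
    that is most cosine-similar to the perpendicular foot of t (assumed criterion)."""
    foot, _ = perpendicular_foot(a, b, t)
    lams = np.linspace(0.0, 1.0, num_steps)
    candidates = [(1.0 - lam) * a + lam * b for lam in lams]
    scores = [cosine_similarity(c, foot) for c in candidates]
    best = int(np.argmax(scores))
    return lams[best], candidates[best]

# Toy 2-D example: the target sits symmetrically between the two embeddings.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
t = np.array([1.0, 1.0])
foot, lam_foot = perpendicular_foot(a, b, t)
lam_best, emb_best = best_interpolation(a, b, t)
```

Restricting the search to `lam` in `[0, 1]` keeps the selected embedding between the two prompts even when the foot itself falls outside the segment.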

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.09982

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Oceania > Australia > New South Wales (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Transportation (0.30)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

On Optimal Interpolation in Linear Regression

Neural Information Processing Systems, Jan-19-2025, 13:38:50 GMT

Understanding when and why interpolating methods generalize well has recently been a topic of interest in statistical learning theory. However, systematically connecting interpolating methods to achievable notions of optimality has only received partial attention. In this paper, we ask what the optimal way to interpolate in linear regression is, using functions that are linear in the response variable (as is the case for the Bayes optimal estimator in ridge regression) and depend on the data, the population covariance of the data, the signal-to-noise ratio and the covariance of the prior for the signal, but do not depend on the value of the signal itself nor the noise vector in the training data. We provide a closed-form expression for the interpolator that achieves this notion of optimality and show that it can be derived as the limit of preconditioned gradient descent with a specific initialization. We identify a regime where the minimum-norm interpolator provably generalizes arbitrarily worse than the optimal response-linear achievable interpolator that we introduce, and validate with numerical experiments that the notion of optimality we consider can be achieved by interpolating methods that only use the training data as input in the case of an isotropic prior. Finally, we extend the notion of optimal response-linear interpolation to random features regression under a linear data-generating model.

linear regression, optimal interpolation, training data, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.64)

SPDE priors for uncertainty quantification of end-to-end neural data assimilation schemes

Beauchamp, Maxime, Desassis, Nicolas, Johnson, J. Emmanuel, Benaichouche, Simon, Tandeo, Pierre, Fablet, Ronan

arXiv.org Artificial Intelligence, Feb-2-2024

The spatio-temporal interpolation of large geophysical datasets has historically been addressed by Optimal Interpolation (OI) and more sophisticated model-based or data-driven data assimilation (DA) techniques. In the last ten years, the link established between Stochastic Partial Differential Equations (SPDE) and Gaussian Markov Random Fields (GMRF) opened a new way of handling both large datasets and physically-induced covariance matrices in Optimal Interpolation. Recent advances in the deep learning community also make it possible to address this problem with neural architectures that embed a variational data assimilation framework. The reconstruction task is seen as a joint learning problem of the prior involved in the variational inner cost and the gradient-based minimization of the latter: both prior models and solvers are stated as neural networks with automatic differentiation which can be trained by minimizing a loss function, typically stated as the mean squared error between some ground truth and the reconstruction. In this work, we draw from the SPDE-based Gaussian Processes to estimate complex prior models able to handle non-stationary covariances in both space and time and provide a stochastic framework for interpretability and uncertainty quantification. Our neural variational scheme is modified to embed an augmented state formulation with both state and SPDE parametrization to estimate. Instead of a neural prior, we use a stochastic PDE as surrogate model along the data assimilation window. The training involves a loss function for both reconstruction task and SPDE prior model, where the likelihood of the SPDE parameters given the true states is involved in the training. Because the prior is stochastic, we can easily draw samples in the prior distribution before conditioning to provide a flexible way to estimate the posterior distribution based on thousands of members.
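For readers unfamiliar with the classical baseline mentioned at the start of this abstract, the following is a minimal sketch of the textbook Optimal Interpolation update, not of the neural SPDE scheme the paper proposes. It assumes a linear observation operator `H`, a background (prior) mean `x_b` with covariance `B`, and observation-error covariance `R`; the grid, length scale, and observation values in the example are made up for illustration.

```python
import numpy as np

def optimal_interpolation(x_b, B, H, R, y):
    """Textbook OI / best linear unbiased update.

    x_a = x_b + K (y - H x_b),  K = B H^T (H B H^T + R)^{-1},  A = (I - K H) B
    """
    S = H @ B @ H.T + R                 # innovation covariance
    # Solve instead of inverting; valid because S and B are symmetric.
    K = np.linalg.solve(S, H @ B).T     # gain matrix B H^T S^{-1}
    x_a = x_b + K @ (y - H @ x_b)       # analysis (posterior mean)
    A = (np.eye(len(x_b)) - K @ H) @ B  # analysis (posterior) covariance
    return x_a, A

# Hypothetical 1-D grid of 5 points, observed at both ends.
grid = np.arange(5.0)
B = np.exp(-0.5 * (grid[:, None] - grid[None, :]) ** 2 / 1.5**2)  # assumed prior covariance
H = np.zeros((2, 5))
H[0, 0] = 1.0
H[1, 4] = 1.0
R = 0.01 * np.eye(2)
x_b = np.zeros(5)
y = np.array([1.0, -1.0])

x_a, A = optimal_interpolation(x_b, B, H, R, y)
```

The dense solve against the innovation covariance is the step that becomes expensive for large datasets, which is the bottleneck the SPDE and GMRF link described in the abstract is meant to ease.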

formulation, parametrization, uncertainty quantification, (14 more...)

arXiv.org Artificial Intelligence

2402.01855

Country:

South America > Chile (0.04)
North America > United States > New York (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry:

Energy (0.46)
Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Correcting public opinion trends through Bayesian data assimilation

Hendrickx, Robin, Arcucci, Rossella, Lopez, Julio Amador Díaz, Guo, Yi-Ke, Kennedy, Mark

arXiv.org Artificial Intelligence, May-29-2021

Measuring public opinion is a key focus during democratic elections, enabling candidates to gauge their popularity and alter their campaign strategies accordingly. Traditional survey polling remains the most popular estimation technique, despite its cost and time intensity, measurement errors, lack of real-time capabilities and lagged representation of public opinion. In recent years, Twitter opinion mining has attempted to combat these issues. Despite achieving promising results, it experiences its own set of shortcomings such as an unrepresentative sample population and a lack of long term stability. This paper aims to merge data from both these techniques using Bayesian data assimilation to arrive at a more accurate estimate of true public opinion for the Brexit referendum. This paper demonstrates the effectiveness of the proposed approach using Twitter opinion data and survey data from trusted pollsters. Firstly, the possible existence of a time gap of 16 days between the two data sets is identified. This gap is subsequently incorporated into a proposed assimilation architecture. This method was found to adequately incorporate information from both sources and measure a strong upward trend in Leave support leading up to the Brexit referendum. The proposed technique provides useful estimates of true opinion, which is essential to future opinion measurement and forecasting research.

evolutionary algorithm, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2105.14276

Country:

North America > United States (0.68)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Africa > East Africa (0.04)

Genre:

Research Report (0.64)
Questionnaire & Opinion Survey (0.46)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (0.56)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.46)

Determinantal Point Processes Implicitly Regularize Semi-parametric Regression Problems

Fanuel, Michaël, Schreurs, Joachim, Suykens, Johan A. K.

arXiv.org Machine Learning, Nov-13-2020

Semi-parametric regression models are used in several applications which require comprehensibility without sacrificing accuracy. Examples are spline interpolation in geophysics, or non-linear time series problems, where the system includes for instance a linear and non-linear component. We discuss here the use of a finite Determinantal Point Process (DPP) sampling for approximating semi-parametric models in two cases. On the one hand, in the case of large training data sets, DPP sampling is used to reduce the number of model parameters. On the other hand, DPPs can determine experimental designs in the case of the optimal interpolation models. Recently, Barthelmé, Tremblay, Usevich, and Amblard introduced a novel representation of finite DPPs. They formulated extended $L$-ensembles that can conveniently represent for instance partial-projection DPPs and suggest their use for optimal interpolation. With the help of this formalism, we derive a key identity illustrating the implicit regularization effect of determinantal sampling for semi-parametric regression and interpolation. Also, a novel projected Nyström approximation is defined and used to derive a bound on the expected risk for the corresponding approximation of semi-parametric regression. This work naturally extends similar results obtained for kernel ridge regression.
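As background for the approximation this abstract builds on, here is a minimal sketch of the standard Nyström approximation of a kernel matrix from a subset of landmark points. It is not the projected variant the paper defines, and the landmarks are drawn uniformly at random rather than from a DPP, purely to keep the example short; the RBF kernel, the data, and the function names are assumptions.

```python
import numpy as np

def rbf_kernel(X, Z, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Z."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * lengthscale**2))

def nystrom(K, landmarks):
    """Standard Nystrom approximation K_hat = C W^+ C^T,
    with C = K[:, landmarks] and W = K[landmarks, landmarks]."""
    C = K[:, landmarks]
    W = K[np.ix_(landmarks, landmarks)]
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 2))
K = rbf_kernel(X, X)

# Uniform sampling stands in for DPP sampling here (a simplifying assumption).
landmarks = rng.choice(30, size=8, replace=False)
K_hat = nystrom(K, landmarks)
```

The approximation reproduces the kernel matrix exactly on the landmark rows and columns, so its quality depends on how well the landmarks cover the data, which is where diversity-promoting sampling such as a DPP comes in.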

approximation, matrix, regression, (14 more...)

arXiv.org Machine Learning

2011.06964

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
